ABSTRACT


This paper investigates the conditions under which a discrete
optimization problem can be formulated as a dynamic program.
Following the terminology of (Karp and Held 1967), a discrete
optimization problem is formalized as a discrete decision problem
and the class of dynamic programs is formalized as a sequential
decision process. Necessary and sufficient conditions are
established for the representation, in two different senses, of a
discrete decision problem by a sequential decision process. In the
first sense (a strong representation) the set of all optimal
solutions to the discrete optimization problem is obtainable from
the solution of the functional equations of dynamic programming.
In the second sense (a weak representation) a nonempty subset of
optimal solutions is obtainable from the solution of the
functional equations of dynamic programming. It is shown that the
well known principle of optimality corresponds to a strong
representation. A more general version of the principle of
optimality is given which corresponds to a weak representation of
a discrete decision problem by a sequential decision process. We
also show that the class of strongly representable discrete
decision problems is equivalent to the class of sequential
decision processes which have cost functions satisfying a strict
monotonicity condition. Finally, a new derivation is given of the
result that the class of weakly representable discrete decision
problems is equivalent to the class of sequential decision
processes which have a cost function satisfying a monotonicity
condition.


1.  Introduction 


Dynamic programming has proven to be one of the principal methods
for the formulation and solution of discrete optimization
problems. A number of studies have explored the extent to which
dynamic programming is applicable to such problems, including
(Mitten 1954, Karp and Held 1967, Elmaghraby 1970, Bonzon 1970,
Ibaraki 1972, 1973, and others cited in the references). A recent
survey of solution techniques and applications of dynamic
programming appears in (Morin 1978). Mitten was the first to point
out the essential role that the monotonicity of the cost function
plays in a dynamic program. Subsequently, (Karp and Held 1967)
studied dynamic programs in terms of a finite state machine with a
superimposed cost structure (an sdp as defined below), and
attacked the problem of characterizing the representations of a
discrete optimization problem by an sdp with a monotonic cost
function.

In this paper the notion of a discrete optimization problem is
formalized as a discrete decision problem (ddp), and the general
setting within which the functional equations of dynamic
programming can be applied is formalized as a sequential decision
process (sdp), following the general lines of (Karp and Held
1967). Necessary and sufficient conditions for the
representation, in two different senses, of a ddp by an sdp are
established in theorems 2 through 7. In the first sense (a strong
representation) the set of all optimal solutions to the discrete
optimization problem is obtainable from the solution of the
functional equations of dynamic programming. In the second sense
(a weak representation) a nonempty subset of optimal solutions is
obtainable from the solution of the functional equations of
dynamic programming. It is shown that the well known principle of
optimality corresponds to a strong representation. A more general
version of the principle of optimality is given which corresponds
to a weak representation of a ddp by an sdp. It is shown that
sdp's having a strictly monotonic cost function are in one-to-one
correspondence with strong representations of ddp's. Finally, a
new derivation is given of the result that sdp's having a
monotonic cost function are in one-to-one correspondence with weak
representations of a ddp.

Our notion of a weak representation is new in that we require
neither all optimal solutions nor the correct cost of the optimal
solutions, but are satisfied with some optimal solutions.
Presumably, if the correct costs were required, one could compute
the cost of the optimal solutions using the cost function of the
ddp after they have been found by some method. The notion of
strong representation was introduced, along with an even stronger
sense of representation, in (Ibaraki 1972).

2.  Definitions. 


A discrete decision problem is intended as a general model of
combinatorial optimization problems. A discrete decision problem
is a system D=(A,S,P,f) where

    A is a finite nonempty alphabet (set of primitive decisions),
    S ⊆ A* (set of feasible policies),
    P is a set (the set of data inputs for the problem),
    f: S×P → R, where R is the set of positive reals (cost or
    objective function).

An instance of a discrete decision problem D, denoted D(p), is
given by a particular data input p∈P. A policy s∈S is optimal with
respect to input p∈P if f(s,p) ≤ f(t,p) for all t∈S. The set of
optimal policies for the problem instance D(p) is denoted O(D,p).
We will be interested in the conditions under which the problem of
finding O(D,p), or a subset of O(D,p), can be formulated as a
dynamic program.

One of the simplest discrete decision problems is the problem of
finding the least cost path from the start node to a goal node in
an arc-weighted directed graph. This problem can be represented as
a ddp as follows: let A be the set of arcs (i,j) in the graph,
where (i,j) represents the decision to move from node i to node j;
S is then the set of sequences of arcs which move from the start
node to a final node; P is the set of cost matrices (p_ij), where
p_ij is the cost of arc (i,j); and finally f(s,p) is the cost of
arc sequence (path) s with respect to input p; more precisely,

$$f(s,p) = \sum_{(i,j) \in s} p_{ij}.$$
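The shortest path ddp described above can be sketched in a few lines
of Python. This is only an illustration under stated assumptions: the
three-node graph, the names START, GOALS, is_feasible and
optimal_policies, and the length bound max_len are all hypothetical
and do not come from the paper. The sketch enumerates the feasible
policies S by brute force and returns O(D,p).

```python
from itertools import product

# Hypothetical toy graph: node 0 is the start node, node 2 the goal node.
# The alphabet A is the set of arcs (i, j).
A = [(0, 1), (1, 2), (0, 2)]
START, GOALS = 0, {2}

def is_feasible(s):
    """True if the arc sequence s is a path from START to a goal node."""
    node = START
    for (i, j) in s:
        if i != node:
            return False
        node = j
    return node in GOALS

def f(s, p):
    """Cost of policy s under input p: the sum of p[i][j] over the arcs of s."""
    return sum(p[i][j] for (i, j) in s)

def optimal_policies(p, max_len=3):
    """Brute-force O(D, p), restricted to policies of length <= max_len."""
    S = [s for n in range(1, max_len + 1)
         for s in product(A, repeat=n) if is_feasible(s)]
    best = min(f(s, p) for s in S)
    return {s for s in S if f(s, p) == best}
```

Calling optimal_policies with a nested dictionary of arc costs, for
example {0: {1: 1, 2: 5}, 1: {2: 1}}, returns the set of least cost
paths for that instance.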


The functional equations of dynamic programming apply to a kind of
process called a sequential decision process. A sequential
decision process (sdp) is a system Π=(A,Q,q0,Qf,t,h,k,P) where

    A is a finite nonempty alphabet (set of primitive decisions),
    Q is a set (set of states),
    q0 ∈ Q (start state),
    Qf ⊆ Q (set of final states),
    t: Q×A → Q (transition function),
    h: R×Q×A×P → R (cost or objective function),
    k: P → R (initial cost function),
    P is a set (input data specifications).

The transition function t applies a decision a∈A to a state q∈Q,
resulting in a transition to a new state t(q,a). We can extend the
domain of t to Q×A* by the following recursive definition: let
t(q,e)=q for q∈Q, where e is the empty sequence, and
t(q,xa)=t(t(q,x),a) for q∈Q, x∈A*, and a∈A. Thus t(q,xa) is the
state resulting from applying the decision sequence xa to the
initial state q. When only one argument is given to t, the path is
assumed to originate at the start state; thus t(x) is the state
resulting from applying the decision sequence x from the start
state. Let F(Π) = {x | t(x)∈Qf}. Each x∈F(Π) is a feasible
decision sequence which t maps (by definition) from q0 to some
final state qf∈Qf. Note that the first five components of a
sequential decision process comprise a finite state automaton
(Hopcroft and Ullman 1969). The cost function h(c,q,a,p) is the
cost of reaching state t(q,a) by a sequence reaching state q with
cost c which is extended by decision a. The initial cost function
k(p) is the cost of a null sequence given input p. It will be
useful to consider the special case of decision sequences applied
to the start state as follows: let g(e,p)=k(p) and
g(xa,p) = h(g(x,p),t(x),a,p) for x∈A*, a∈A, p∈P. Thus g(x,p) gives
the cost of reaching state t(x) from q0 by means of the sequence
of decisions x. Finally, since we are interested in optimal
decision sequences, let us define (and assume the existence of)
G(q0,p)=k(p) and

$$G(q,p) = \min_{\{x \mid t(x)=q\}} g(x,p) \quad \text{for all } q \neq q_0,\ p \in P;$$

thus G(q,p) is the cost of the least cost decision sequence
reaching state q from q0. We say x∈A* is an optimal decision
sequence reaching state q if t(x)=q and G(q,p)=g(x,p). The set of
optimal decision sequences reaching a final state of Π is denoted
O(Π,p). Note that O(Π,p) is always nonempty since there is at
least one least cost sequence reaching each final state of Π. An
sdp Π represents a ddp D if F(Π)=S and O(Π,p) ⊆ O(D,p).
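The definitions of the extended transition function t, the sequence
cost g and the least cost G can be sketched as follows. This is a
minimal illustration, not the paper's construction: the toy
automaton DELTA, the additive choice of h, the zero initial cost k
and the length bound max_len are assumptions made for the example,
and undefined transitions are modelled by returning None, whereas
the paper takes t to be total.

```python
from itertools import product

A = ["a", "b"]                      # decisions (hypothetical toy alphabet)
Q0, QF = 0, {2}                     # start state and final states
DELTA = {(0, "a"): 1, (0, "b"): 2, (1, "a"): 2}   # one-step transitions t(q, a)

def t(x, q=Q0):
    """Extended transition: t(q, e) = q and t(q, xa) = t(t(q, x), a).
    Returns None when some decision is undefined in DELTA."""
    for a in x:
        q = DELTA.get((q, a))
        if q is None:
            return None
    return q

def h(c, q, a, p):
    """Assumed additive cost: previous cost c plus the cost p[(q, a)]."""
    return c + p[(q, a)]

def k(p):
    """Assumed initial cost of the empty sequence."""
    return 0

def g(x, p):
    """g(e, p) = k(p) and g(xa, p) = h(g(x, p), t(x), a, p)."""
    c, q = k(p), Q0
    for a in x:
        c = h(c, q, a, p)
        q = DELTA[(q, a)]
    return c

def G(q, p, max_len=3):
    """Least cost over all sequences of length <= max_len that reach q."""
    return min(g(x, p) for n in range(max_len + 1)
               for x in product(A, repeat=n) if t(x) == q)
```

The brute-force G is only practical for tiny examples; the point of
the functional equations discussed later in the paper is to avoid
this enumeration.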

3.  Representations of a discrete decision problem.

Before turning to our primary problem of characterizing the
representations of a ddp by a dynamic program, we give necessary
and sufficient conditions for the representation, as defined
above, of a ddp by an sdp. We first summarize some concepts and
results on finite automata (Hopcroft and Ullman 1969) which will
be needed only in the present section. The equiresponse relation
of a finite automaton is defined by the relation xRy iff t(x)=t(y)
for all x,y∈A*. An equivalence relation R on A* is called right
invariant if xRy → (∀z∈A*) xzRyz. If R and T are equivalence
relations on A* then R refines T if ∀x,y∈A* xRy → xTy. An
equivalence relation has finite rank if it has only a finite
number of equivalence classes. Note that the equiresponse relation
of a finite automaton is right invariant since t(x)=t(y) →
t(xz) = t(t(x),z) = t(t(y),z) = t(yz). Finally, for some S ⊆ A*
define the equivalence relation R_S as follows:
xR_S y iff (∀z∈A*) xz∈S ↔ yz∈S.

The following proposition gives us an essential property of finite
automata.

Proposition 1. Let S ⊆ A* and let R be a right invariant
equivalence relation of finite rank; then R is the equiresponse
relation of a finite automaton which accepts S iff R refines R_S.

proof: see (Hopcroft and Ullman 1969; p. 29).

Theorem 1. An sdp Π=(A,Q,q0,Qf,t,h,k,P) represents a ddp
D=(A,S,P,f) iff the following conditions hold:

1. the equivalence relation R defined by xRy iff t(x)=t(y) for
   x,y∈A* is a right invariant equivalence relation of finite
   rank which refines R_S.

2. (∀p∈P)(∃x s.t. t(x)∈Qf)(∀y s.t. t(y)∈Qf) g(y,p) ≤ g(x,p) →
   y∈O(D,p).

proof: (if): Suppose that conditions 1 and 2 hold. By proposition
1, R is the equiresponse relation of a finite automaton which
accepts the language S, so F(Π)=S. Let x satisfy condition 2, so
(∀y∈S s.t. t(y)∈Qf) g(y,p) ≤ g(x,p) → y∈O(D,p). Let ŷ∈O(Π,p), so
(∀y s.t. t(y)∈Qf) g(ŷ,p) ≤ g(y,p) → g(ŷ,p) ≤ g(x,p) → ŷ∈O(D,p);
thus O(Π,p) ⊆ O(D,p).

(only if): Suppose now that Π represents D, so F(Π)=S and
O(Π,p) ⊆ O(D,p). R is the equiresponse relation of a finite
automaton which accepts S, so R is a right invariant equivalence
relation of finite rank. By proposition 1, R refines R_S, so
condition 1 holds. Let ŷ∈O(Π,p); then (∀y s.t. t(y)∈Qf)
g(y,p) ≤ g(ŷ,p) → g(y,p) = g(ŷ,p) → y∈O(Π,p) → y∈O(D,p). Thus
condition 2 holds with x = ŷ. QED

There are several important aspects to our representations of
ddp's by sdp's which should be pointed out. In mapping from a ddp
to an sdp, we assume the notion of a state (the equivalence
classes of R in theorem 1), the existence of the transition
function t which only depends on the current state and input
decision, and a cost function which is separable in the sense that
the cost of adding a transition onto the end of a sequence only
depends on the current state, the input decision, and the cost of
the sequence (in general the cost might depend on all previous
decisions). This much structure is implicit in the concept of a
dynamic program. A closer examination of these assumptions may be
found in (Elmaghraby 1970).

4.  Strong representations of a discrete decision problem.

Our purpose is to discover the conditions under which an sdp Π
represents a ddp D by means of a discrete dynamic program. The
principle underlying dynamic programming has been formulated by
Bellman in the Principle of Optimality (Bellman 1957) and can be
paraphrased as follows:

An optimal sequence has the property that no matter what the
next-to-last state and the next-to-last decision are, the sequence
reaching the next-to-last state must be optimal.

This version of the principle of optimality is illustrated in
figure 1a. If, for a∈A and x∈A*, xa is an optimal sequence from
state q0 to qf, then x is an optimal sequence from q0 to q. In
general the principle of optimality implies that if xy, for
x,y∈A*, is an optimal sequence from q0 to qf, then x is an optimal
sequence from q0 to t(q0,x) and y is an optimal sequence from
t(q0,x) to qf, as illustrated in figure 1b. This illustration
applies only to discrete sequences and so should not be construed
to demonstrate the full range of dynamic programming, which is
much broader.

In terms of an sdp the principle of optimality can be made precise
as follows:

$$(\forall p \in P)(\forall x \in A^*)(\forall a \in A)\quad G(t(xa),p) = g(xa,p) \rightarrow G(t(x),p) = g(x,p) \tag{1}$$

The following theorem states an equivalent form for (1). Let
Π=(A,Q,q0,Qf,t,h,k,P) be an sdp. h is s'-monotonic if for all
states q∈Q, optimal sequences xa reaching state q, and sequences
ya reaching q, we have g(x,p) < g(y,p) ↔ g(xa,p) < g(ya,p). An sdp
containing an s'-monotonic cost function is an s'-monotonic
sequential decision process (s'-msdp). We say h is strictly
monotonic (s-monotonic) if for all x,y∈A* such that t(x)=t(y),
g(x,p) < g(y,p) → g(xa,p) < g(ya,p). A sequential decision process
which contains an s-monotonic cost function is called a strictly
monotonic sequential decision process (s-msdp).
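The strict monotonicity condition can be checked mechanically on a
small example. The sketch below is an illustration under
assumptions: the three-state automaton DELTA, the input P, the two
candidate cost functions (additive and bottleneck) and the bound
max_len are hypothetical choices, not taken from the paper. It tests
whether g(x,p) < g(y,p) implies g(xa,p) < g(ya,p) for every pair of
sequences reaching the same state.

```python
from itertools import product

A = ["a", "b", "c"]
DELTA = {(0, "a"): 1, (0, "b"): 1, (1, "c"): 2}   # hypothetical transitions
P = {"a": 1, "b": 2, "c": 3}                      # hypothetical input p

def run(x, h):
    """Return (t(x), g(x, P)) from state 0 with k(P) = 0, or None if undefined."""
    q, c = 0, 0
    for a in x:
        if (q, a) not in DELTA:
            return None
        c = h(c, q, a, P)
        q = DELTA[(q, a)]
    return q, c

def is_strictly_monotonic(h, max_len=2):
    """Check g(x) < g(y) -> g(xa) < g(ya) for all x, y of length <= max_len
    with t(x) = t(y)."""
    seqs = [x for n in range(max_len + 1)
            for x in product(A, repeat=n) if run(x, h) is not None]
    for x, y in product(seqs, repeat=2):
        (qx, gx), (qy, gy) = run(x, h), run(y, h)
        if qx != qy or not gx < gy:
            continue
        for a in A:
            rx, ry = run(x + (a,), h), run(y + (a,), h)
            if rx is not None and not rx[1] < ry[1]:
                return False
    return True

def additive(c, q, a, p):
    return c + p[a]            # sum of decision costs

def bottleneck(c, q, a, p):
    return max(c, p[a])        # largest decision cost seen so far
```

On this example the additive cost passes the check, while the
bottleneck cost fails it: the prefixes a and b have different costs,
but both extensions by c cost the same.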

Theorem 2. (1) holds for an sdp Π=(A,Q,q0,Qf,t,h,k,P) iff h is
s'-monotonic.

proof: (only if): Suppose that (1) holds for some sdp Π and that h
is not s'-monotonic. Let xa be an optimal sequence reaching state
q and let y be a sequence such that t(x)=t(y)=q'. Suppose first
that g(x,p) < g(y,p) and g(xa,p) ≥ g(ya,p). Since
G(q,p) = g(xa,p) ≥ g(ya,p), we have g(xa,p) = g(ya,p). By (1),
G(q',p) = g(x,p) = g(y,p), but this contradicts our assumption
that g(x,p) < g(y,p). Thus g(x,p) < g(y,p) → g(xa,p) < g(ya,p).
Suppose instead we have g(xa,p) < g(ya,p) but g(x,p) ≥ g(y,p).
Now g(x,p) ≠ g(y,p) since g(xa,p) ≠ g(ya,p), so g(x,p) > g(y,p).
But by (1) and our assumption that xa is an optimal sequence
reaching q, we have G(q',p) = g(x,p) ≤ g(y,p) by definition of G.
This contradiction shows that g(xa,p) < g(ya,p) → g(x,p) < g(y,p)
when xa is an optimal sequence reaching state q. Thus (1) → h is
s'-monotonic.

(if): Suppose now that h is s'-monotonic. If (1) does not hold
then for some sequence xa such that t(xa)=q, we have
G(q,p) = g(xa,p) but G(q',p) ≠ g(x,p), where t(q',a)=q. For some
y∈A* such that t(x)=t(y) we have G(q',p) = g(y,p) < g(x,p). If
g(ya,p) = g(xa,p) = G(q,p) then h is not s'-monotonic (with
respect to optimal sequence xa), so we must have
g(ya,p) > g(xa,p). But since h is s'-monotonic we have
g(y,p) > g(x,p), which contradicts our earlier finding that
g(y,p) < g(x,p). Thus (1) must hold. QED

In practice we wish to find optimal policies between states. We
define below the tables T(q,p) which store the information
necessary to obtain optimal policies. Formally, for all q∈Q, p∈P,
T(q,p) is a subset of Q×A (T: Q×P → 2^(Q×A)). A set of policies
θ(q,p) is obtainable from the tables T(q,p) as follows: let

    θ(q0,p) = {e}, where e is the empty string,
    θ(q,p) = {ya | (q',a)∈T(q,p) and y∈θ(q',p)} for q≠q0.

A ddp D=(A,S,P,f) is strongly-represented (weakly-represented) by
an sdp Π=(A,Q,q0,Qf,t,h,k,P) if i) Π represents D, ii) the
functional equations (2) and (3) given below hold, and iii) for
q∈Q, p∈P the set of policies obtainable from the tables T(q,p) is
the set (a subset) of all optimal policies; in particular

$$\bigcup_{q \in Q_f} \theta(q,p) = O(\Pi,p) \qquad \Bigl(\bigcup_{q \in Q_f} \theta(q,p) \subseteq O(\Pi,p) \text{ for a weak representation}\Bigr).$$

$$G(q_0,p) = k(p) \tag{2}$$

$$G(q,p) = \min_{\{(q',a) \mid t(q',a)=q\}} h(G(q',p),q',a,p) \tag{3}$$

$$T(q,p) = \{(q',a) \mid t(q',a)=q,\ G(q,p) = h(G(q',p),q',a,p)\} \tag{4}$$

Note that if Π strongly (weakly) represents D then by (i)
O(Π,p) = O(D,p), and thus ∪ θ(q,p) = O(D,p) (∪ θ(q,p) ⊆ O(D,p)),
where the union is taken over q∈Qf; i.e., the construction of the
tables θ by means of (2), (3), and (4) results in the construction
of all (a nonempty subset of) optimal solutions to the ddp D.
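Equations (2), (3) and (4) translate directly into a small
program. The following sketch assumes an acyclic toy sdp so that
the recursion in (3) terminates; the automaton DELTA, the additive
cost h, the zero initial cost k and the helper names predecessors
and solve are hypothetical and serve only to illustrate how the
tables T(q,p) yield the policy sets θ(q,p).

```python
from functools import lru_cache

# Hypothetical acyclic sdp: states 0..3, start state 0, final states {3}.
DELTA = {(0, "a"): 1, (0, "b"): 2, (1, "c"): 3, (2, "d"): 3}
Q0, QF = 0, {3}

def h(c, q, a, p):
    return c + p[a]            # additive cost; an assumed choice of h

def k(p):
    return 0                   # assumed initial cost

def predecessors(q):
    """All pairs (q', a) with t(q', a) = q."""
    return [(q1, a) for (q1, a), q2 in DELTA.items() if q2 == q]

def solve(p):
    @lru_cache(maxsize=None)
    def G(q):
        if q == Q0:
            return k(p)                                               # equation (2)
        return min(h(G(q1), q1, a, p) for q1, a in predecessors(q))   # equation (3)

    def T(q):                                                         # equation (4)
        return [(q1, a) for q1, a in predecessors(q)
                if G(q) == h(G(q1), q1, a, p)]

    def theta(q):
        if q == Q0:
            return {""}        # only the empty sequence reaches the start state
        return {y + a for q1, a in T(q) for y in theta(q1)}

    costs = {q: G(q) for q in QF}
    policies = set().union(*(theta(q) for q in QF))
    return costs, policies
```

For the input {"a": 1, "b": 2, "c": 2, "d": 1} both paths to state 3
cost 3, and the tables return both of them, as expected for an
additive (strictly monotonic) cost.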

Lemma 1. x∈θ(q,p) → x is an optimal sequence reaching state q.

proof: the lemma follows immediately from the stronger lemma 2
which is given in the appendix.

We do not require that an optimal sequence have the same cost in D
as in Π. Our interest is in obtaining optimal solutions and in
making use of the functional equations (2) and (3). These
equations are characteristic of dynamic programming and are often
considered a direct translation of the principle of optimality. We
take (1) as a more direct translation and show next that, in the
sense of a strong representation, (1) and the equations (2) and
(3) are equivalent.

Theorem 3. A ddp D=(A,S,P,f) is strongly-represented by an sdp
Π=(A,Q,q0,Qf,t,h,k,P) iff Π represents D and (1) holds.

proof: (if): Suppose that (1) holds and Π represents D. In order
to show that the ddp D may be strongly-represented by the sdp Π,
we must show that Π represents D (which we have assumed), that (2)
and (3) hold, and that all optimal policies may be obtained from
the tables defined by (4). First, (2) holds by definition of G.
Let H(q,p) denote the right hand side of (3). We will show that
G(q,p)=H(q,p). Suppose that ya is an optimal policy reaching state
q, so G(q,p)=g(ya,p). Since (1) holds we then have G(q̂,p)=g(y,p)
where t(q̂,a)=q. Thus

$$G(q,p) = h(g(y,p),\hat{q},a,p) = h(G(\hat{q},p),\hat{q},a,p) \ge \min_{\{(q',a) \mid t(q',a)=q\}} h(G(q',p),q',a,p) = H(q,p),$$

or G(q,p) ≥ H(q,p).

Now let H(q,p) = h(G(q̂,p),q̂,a,p) for some q̂∈Q and suppose
G(q̂,p)=g(y,p) where t(y)=q̂; i.e., y is an optimal policy reaching
q̂. Let t(ya)=q; then G(q,p) ≤ g(ya,p) = h(g(y,p),q̂,a,p) =
h(G(q̂,p),q̂,a,p) = H(q,p), thus G(q,p) ≤ H(q,p). Combining these
results we have G(q,p)=H(q,p) and (3) holds.

By lemma 1 all policies in θ(q,p) are optimal with respect to h.
Suppose though that not all optimal policies can be obtained from
(4). Let xa be an optimal policy of shortest length reaching state
q which is not in θ(t(xa),p). Let t(x)=q'. By (1) x is optimal,
thus x∈θ(q',p) (since x has shorter length than xa) and
G(q',p)=g(x,p). Since xa ∉ θ(t(xa),p) we must have
G(t(xa),p) < h(G(q',p),q',a,p) = h(g(x,p),q',a,p) = g(xa,p), but
this contradicts our assumption that xa is an optimal sequence
reaching state q. Therefore (q',a)∈T(q,p) and by definition
xa∈θ(q,p), so θ(q,p) is the set of all optimal sequences reaching
state q. In particular ∪ θ(q,p) = O(Π,p), where the union is taken
over q∈Qf.

(only if): Suppose now that the ddp D is strongly-representable by
the sdp Π. For some q∈Q, x∈A* we are able to obtain all optimal
policies reaching state q using (2), (3), and (4). Consider
xa∈θ(q,p) where t(xa)=q, t(x)=q'. By lemma 1 xa is an optimal
sequence reaching state q. By definition x∈θ(q',p), and by lemma 1
x is an optimal policy reaching q', so G(q',p)=g(x,p). Thus (1)
holds. Π represents D by assumption. QED

Corollary 1. A ddp D=(A,S,P,f) is strongly-represented by an sdp
Π=(A,Q,q0,Qf,t,h,k,P) iff Π represents D and Π is an s'-msdp.

proof: immediate from theorems 2 and 3.

The s'-monotonicity of the cost function of an sdp is an essential
ingredient in a strong representation of a ddp. It can be shown,
however, that any s'-monotonic cost function is effectively
equivalent to some strictly monotonic cost function.


Given an s'-monotonic function h, define the function g' (and
thereby h' implicitly) as follows, where q=t(xa):

$$g'(xa,p) = \begin{cases} g(xa,p) & \text{if } G(q,p) = g(xa,p) \\ G(q,p) + g'(x,p) & \text{otherwise.} \end{cases} \tag{5}$$

Define

$$G'(q,p) = \min_{\{x \mid t(x)=q\}} g'(x,p).$$

Note that by definition G(q,p)=G'(q,p) for all states q and inputs
p. Lemma 4 given in the appendix establishes the effective
equivalence of h and h' in the sense that the set of optimal
sequences obtained for each state is the same for both cost
functions.

Lemma 3. If h is s'-monotonic then h' defined by (5) is strictly
monotonic.

proof: Let h' be defined from the s'-monotonic function h by (5).
Suppose for x,y∈A* such that t(x)=t(y)=q', we have
g'(x,p) < g'(y,p). We have 2 cases to consider in order to show
that g'(xa,p) < g'(ya,p). Let a∈A such that t(xa)=q. Case 1: ya is
not optimal. By construction of g', g'(ya,p) = G(q,p)+g'(y,p), and
g'(xa,p) has the value G(q,p) or G(q,p)+g'(x,p), either of which
is strictly less than g'(ya,p). Case 2: ya is an optimal sequence
reaching state q. If ya is optimal then
g'(ya,p) = g(ya,p) = G(q,p). Also by theorem 2, (1) holds, so y is
an optimal sequence; i.e., g'(y,p) = g(y,p) = G(q',p) = G'(q',p),
but this contradicts our assumption that
g'(x,p) < g'(y,p) = G'(q',p). QED

Theorem 4. A ddp D=(A,S,P,f) is strongly represented by an sdp
Π=(A,Q,q0,Qf,t,h,k,P) iff there is a strictly monotonic sdp
Π'=(A,Q,q0,Qf,t,h',k,P) which strongly represents D.

proof: (only if): Clearly any s-msdp is an s'-msdp, so by
corollary 1 the statement of the theorem is consistent and D is
strongly represented by Π'.

(if): Suppose that D is strongly represented by
Π=(A,Q,q0,Qf,t,h,k,P); then by corollary 1 h is an s'-monotonic
cost function. Consider h' defined by (5), which is s-monotonic by
lemma 3. We need to show that Π'=(A,Q,q0,Qf,t,h',k,P) strongly
represents D. (2) holds by definition. In order to show that (3)
holds, let xa be an optimal sequence reaching state q. By
construction G(q,p)=G'(q,p) for all states q∈Q. Equation (3) then
holds for G' since it holds for G by corollary 1. Equation (4)
holds since lemma 4, given in the appendix, shows that
θ'(q,p)=θ(q,p), so θ'(q,p) is the set of all optimal sequences
reaching state q. Finally Π' represents D since F(Π')=F(Π)=S and

$$O(D,p) = O(\Pi,p) = \bigcup_{q \in Q_f} \theta(q,p) = \bigcup_{q \in Q_f} \theta'(q,p) = O(\Pi',p). \quad \text{QED}$$

5.  Weak representations of a discrete decision problem.

We have been looking at the conditions under which we can find all
optimal decision sequences reaching any state from q0. In practice
we may relax this requirement and be satisfied with some (or just
one) of the optimal sequences to each state in Q. We now explore
the conditions under which this requirement can be satisfied.

We have seen how a direct translation of the principle of
optimality helped to establish the conditions for its application.
In the more general situation faced now it may be helpful to give
a generalized principle of optimality which applies when we are
interested in obtaining only some optimal decision sequences.

Generalized principle of optimality (forward version): If there is
an optimal sequence reaching state q, then there is an optimal
sequence reaching state q with the property that no matter what
the last decision and last state q' were, the sequence reaching q'
is an optimal sequence.

Given p∈P, a sequence xa is 1-optimal if G(t(xa),p)=g(xa,p) and
G(t(x),p)=g(x,p). This generalized principle of optimality can be
formalized as follows:

(∀p∈P)(∀q∈Q) there is a 1-optimal sequence reaching state q    (6)

In these terms we can reformulate the (original) principle of
optimality as follows: ∀p∈P ∀q∈Q every optimal sequence reaching
state q is 1-optimal. Condition (6) can be expressed solely in
terms of the cost function h, as given below in theorem 5. h is
b-monotonic if for all q∈Q, some optimal sequence xa reaching q,
and every sequence ya∈A* reaching q, we have
g(xa,p) < g(ya,p) → g(x,p) < g(y,p). An sdp Π=(A,Q,q0,Qf,t,h,k,P)
in which h is b-monotonic is a b-monotonic sequential decision
process (b-msdp).

Theorem 5. (6) holds iff h is b-monotonic.

proof: (if): Consider an arbitrary state q∈Q and let h be
b-monotonic. We will show there exists a 1-optimal sequence
reaching state q. Let xa be an optimal sequence reaching state q.
Let P(q) denote the set of sequences such that y∈P(q) iff t(y)=q.
Partition P(t(x)) into two sets as follows: let

    Y(x,a) = {y | y∈P(t(x)), g(xa,p)=g(ya,p), g(x,p) > g(y,p)}
    Z(x,a) = {z | z∈P(t(x)), g(xa,p) < g(za,p)} ∪
             {z | z∈P(t(x)), g(xa,p)=g(za,p), g(x,p) ≤ g(z,p)}

For any z∈Z(x,a) we have g(x,p) ≤ g(z,p), either by the
monotonicity of h in the case that g(xa,p) < g(za,p) or by
definition in the other case. Thus if Y(x,a) is empty then
G(t(x),p)=g(x,p) and xa is a 1-optimal sequence reaching state q.
On the other hand, if Y(x,a) is nonempty, we have

$$g(y',p) = \min_{y \in Y(x,a)} g(y,p)$$

for some y'∈Y(x,a). Then g(y',p) ≤ g(y,p) for all y∈Y(x,a), and
g(y',p) < g(x,p) ≤ g(z,p) for all z∈Z(x,a), thus
G(t(x),p)=g(y',p). But g(y'a,p)=g(xa,p)=G(q,p), so y'a is a
1-optimal sequence reaching state q.

(only if): Suppose now that (6) holds. For an arbitrary state q,
let G(q,p)=g(xa,p) and G(q',p)=g(x,p) where t(q',a)=q and t(x)=q';
i.e., xa is a 1-optimal sequence reaching state q. Suppose that h
is not b-monotonic, so for some sequence ya we have
g(xa,p) < g(ya,p) and g(x,p) ≥ g(y,p). By the 1-optimality of xa
we have g(x,p) = G(q',p) ≤ g(y,p). Furthermore we must have
g(x,p) < g(y,p), since g(x,p)=g(y,p) → h(g(x,p),t(x),a,p) =
h(g(y,p),t(x),a,p); i.e., g(xa,p)=g(ya,p). This contradiction
shows that h is b-monotonic. QED


Theorem 6. A ddp D=(A,S,P,f) is weakly-represented by an sdp
Π=(A,Q,q0,Qf,t,h,k,P) iff Π represents D and (6) holds.

proof: (if): Suppose that the ddp D=(A,S,P,f) is
weakly-represented by an sdp Π=(A,Q,q0,Qf,t,h,k,P). By definition
Π represents D. Now let q be an arbitrary state. By (3),

$$G(q,p) = \min_{\{(q',a) \mid t(q',a)=q\}} h(G(q',p),q',a,p).$$

Let G(q,p) = h(G(q̂,p),q̂,a,p) and let G(q̂,p)=g(y,p); then
G(q,p) = h(G(q̂,p),q̂,a,p) = h(g(y,p),q̂,a,p) = g(ya,p). We have
just shown that ya is a 1-optimal sequence reaching state q. Thus
(6) holds.

(only if): Suppose now that Π represents D and (6) holds.
For any state q∈Q, there exists a sequence xa such that t(xa)=q,
G(q,p)=g(xa,p), and G(q̂,p)=g(x,p), where q̂=t(x). Then G(q,p) =
g(xa,p) = h(g(x,p),q̂,a,p) = h(G(q̂,p),q̂,a,p), which implies that
we can find the value G(q,p) by minimizing the expression
h(G(q',p),q',a,p) over all q'∈Q, a∈A such that t(q',a)=q, and thus
we get (3). (2) follows by definition. By definition all elements
of θ(q,p) are optimal sequences which reach state q. To see that
θ(q,p) is nonempty, note that since (6) holds there is a sequence
xa such that G(q,p)=g(xa,p) and G(q',p)=g(x,p) where t(q',a)=q,
and by definition such an xa is in θ(q,p). Finally Π represents D
by assumption. QED

Corollary 2. A ddp D=(A,S,P,f) is weakly-representable by a sdp
Π=(A,Q,q0,Qf,T,h,k,P) iff Π represents D and Π is a b-msdp.

proof:  immediate  from  theorems  5  and  6. 

We have now characterized the classes of sdp's which weakly
and strongly represent ddp's. The difference between these two
types of representations is illustrated in figure 2. Here h is
b-monotonic but h is not s-monotonic. According to equation
(3), in order to determine an optimal sequence reaching q, we
consider an extension of an optimal sequence reaching q'. But in
restricting the search to optimal sequences reaching q', equation
(3) overlooks the optimal sequence ya reaching q. This illustrates
why b-msdp's can only weakly-represent a ddp.
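
The gap between the two kinds of representation can be seen on a
toy instance. The sketch below uses a hypothetical three-state sdp:
the states q0, q1, q2, the actions a, b, c and the cost function h
are all assumptions made for illustration (this is not the example
of figure 2, and the parameter p is dropped). It compares the
optimal sequences found by exhaustive enumeration with those
obtained by solving equation (3) state by state.

```python
from itertools import product

# Hypothetical acyclic sdp, invented for illustration only.
START = "q0"
ACTIONS = "abc"
T = {("q0", "a"): "q1", ("q0", "b"): "q1", ("q1", "c"): "q2"}

def h(cost, state, action):
    """Assumed cost function: additive on the first step, flat (max) afterwards."""
    if state == "q0":
        return cost + (1 if action == "a" else 2)
    return max(cost, 3)

def run(seq):
    """Return (state reached, cost) for seq, or None if a transition is undefined."""
    state, cost = START, 0
    for a in seq:
        if (state, a) not in T:
            return None
        cost, state = h(cost, state, a), T[(state, a)]
    return state, cost

def brute_force(max_len=2):
    """Enumerate every sequence and keep all minimum-cost ones per state."""
    best, opt = {}, {}
    for n in range(max_len + 1):
        for tup in product(ACTIONS, repeat=n):
            seq = "".join(tup)
            res = run(seq)
            if res is None:
                continue
            q, c = res
            if q not in best or c < best[q]:
                best[q], opt[q] = c, {seq}
            elif c == best[q]:
                opt[q].add(seq)
    return best, opt

def functional_equation(order=("q1", "q2")):
    """Solve G(q) = min h(G(q'), q', a) over t(q', a) = q, in topological order."""
    G = {START: 0}
    theta = {START: {""}}
    for q in order:
        cands = [(h(G[qp], qp, a), qp, a) for (qp, a), tgt in T.items() if tgt == q]
        G[q] = min(c for c, _, _ in cands)
        # Only extensions of sequences already kept for q' are considered.
        theta[q] = {x + a for c, qp, a in cands if c == G[q] for x in theta[qp]}
    return G, theta

best, opt = brute_force()
G, theta = functional_equation()
print(best["q2"], sorted(opt["q2"]))    # all optimal sequences reaching q2
print(G["q2"], sorted(theta["q2"]))     # those recovered through equation (3)
```

Under these assumptions both routes agree on the optimal value for
q2, but the sequence bc is optimal for q2 and is never produced by
the recursion, since its prefix b is not optimal for q1.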

The conditions established for the weak-representation of a
ddp are necessary in order to take care of fairly pathological
cost functions. It can be shown however that the cost function
of any sdp which weakly represents a ddp is equivalent to other
cost functions with nicer properties. Given a cost function h
which is b-monotonic, define the function g' (and thereby h') as
follows:

$$g'(x,p) = \begin{cases} g(x,p) & \text{if } x \text{ is 1-optimal} \\ G(t(x),p)+1 & \text{otherwise.} \end{cases} \tag{7}$$

Define $G'(q,p) = \min_{t(x)=q} g'(x,p)$. Lemma 4 given in the
appendix establishes the effective equivalence of h and h' in the
sense that the set of optimal sequences obtained for each state is
the same for both cost functions.
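
As a small illustration of construction (7), the sketch below
builds g' from a tabulated g. The table of sequences, the states
they reach and their costs are hypothetical values chosen for the
example, the parameter p is dropped, and a sequence x is taken to
be 1-optimal when g(x) equals G(t(x)).

```python
# Hypothetical data: sequence -> state reached, and sequence -> cost g.
reach = {"": "q0", "a": "q1", "b": "q1", "ac": "q2", "bc": "q2"}
g = {"": 0, "a": 1, "b": 7, "ac": 3, "bc": 7}

def G_of(cost, reach):
    """G(q): minimum cost over all tabulated sequences reaching q."""
    G = {}
    for x, q in reach.items():
        G[q] = min(G.get(q, cost[x]), cost[x])
    return G

def g_prime(cost, reach):
    """Construction (7): keep g on 1-optimal sequences, else G(t(x)) + 1."""
    G = G_of(cost, reach)
    return {x: cost[x] if cost[x] == G[reach[x]] else G[reach[x]] + 1
            for x in cost}

def one_optimal(cost, reach):
    """Set of sequences whose cost equals the optimum for their state."""
    G = G_of(cost, reach)
    return {x for x in cost if cost[x] == G[reach[x]]}

gp = g_prime(g, reach)
print(gp)
print(G_of(g, reach) == G_of(gp, reach))
print(one_optimal(g, reach) == one_optimal(gp, reach))
```

On this table the non-optimal sequences b and bc are pulled down to
one unit above the optimum of their state, while the optimal values
and the 1-optimal sequences are unchanged.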

h is monotonic if ∀x,y∈A*, ∀a∈A such that t(x)=t(y),
g(x,p) ≤ g(y,p) → g(xa,p) ≤ g(ya,p). An sdp with cost function h
which is monotonic is a monotonic sequential decision process
(m-sdp).
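
The monotonicity condition can be checked mechanically on a small
acyclic instance. The sketch below is only a bounded check over
enumerated sequences; the sdp, the two cost functions h_flat and
h_bad, and the helper names are assumptions introduced for
illustration, and the parameter p is dropped.

```python
from itertools import product

# Hypothetical acyclic sdp, invented for illustration only.
START, ACTIONS = "q0", "abc"
T = {("q0", "a"): "q1", ("q0", "b"): "q1", ("q1", "c"): "q2"}

def h_flat(cost, state, action):
    """Assumed cost function that is nondecreasing in the accumulated cost."""
    if state == "q0":
        return cost + (1 if action == "a" else 2)
    return max(cost, 3)

def h_bad(cost, state, action):
    """Assumed cost function that decreases in the accumulated cost at q1."""
    if state == "q0":
        return cost + (1 if action == "a" else 2)
    return 5 - cost

def evaluate(h, seq):
    """Return (t(seq), g(seq)) or None if some transition is undefined."""
    state, cost = START, 0
    for a in seq:
        if (state, a) not in T:
            return None
        cost, state = h(cost, state, a), T[(state, a)]
    return state, cost

def is_monotonic(h, max_len=2):
    """Check g(x) <= g(y) -> g(xa) <= g(ya) for all enumerated x, y with t(x) = t(y)."""
    reached = []
    for n in range(max_len + 1):
        for tup in product(ACTIONS, repeat=n):
            res = evaluate(h, "".join(tup))
            if res is not None:
                reached.append(res)
    for (qx, gx), (qy, gy) in product(reached, repeat=2):
        if qx != qy or gx > gy:
            continue
        for a in ACTIONS:
            # g(xa) = h(g(x), t(x), a), and likewise for ya.
            if (qx, a) in T and h(gx, qx, a) > h(gy, qy, a):
                return False
    return True

print(is_monotonic(h_flat))
print(is_monotonic(h_bad))
```

With these assumed cost functions, h_flat passes the check and
h_bad fails it, because extending the cheaper sequence a by c gives
a larger cost than extending b by c.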

Lemma 5. If for some sdp Π=(A,Q,q0,Qf,T,h,k,P) h is b-monotonic
then h' defined by (7) is monotonic.

proof: Consider the function h' defined in (7). h' can be shown
to be monotonic as follows. Let t(x)=t(y)=q', t(q',a)=q and
g'(x,p) ≤ g'(y,p). If xa is 1-optimal then g'(xa,p)=g(xa,p)=G(q,p)
and since g'(ya,p) has the value G(q,p) or G(q,p)+1,
g'(xa,p) ≤ g'(ya,p). Suppose now that ya is 1-optimal, then
G(q,p)=g(ya,p) and G(q',p)=g(y,p), g'(ya,p)=g(ya,p) and
g'(y,p)=g(y,p)=G(q',p)=g'(x,p) (since g'(x,p) ≤ g'(y,p)). But if
g'(x,p)=g'(y,p) then g'(xa,p) = h'(g'(x,p),q',a,p) =
h'(g'(y,p),q',a,p) = g'(ya,p) (thus g'(xa,p) ≤ g'(ya,p)). If
neither xa nor ya is 1-optimal then g'(xa,p)=g'(ya,p)=G(q,p)+1.
In all cases the monotonicity of h' is shown. QED


The  following  result  is  well  known  (Elmaghraby  1970,  Bonzon 
1970)  in  the  sense  that  dynamic  programs  are  in  one  to  one 
correspondence  with  monotonic  sdp's.  However  to  the  author's 
knowledge  it  has  not  been  pointed  out  that  m-sdp's  can  only 
weakly  represent  a  ddp;  i.e.,  one  is  not  guaranteed  to  be  able  to 
obtain  all  optimal  solutions  from  a  representation  by  a  m-sdp. 

Theorem 7. A ddp D=(A,S,P,f) is weakly-represented by some sdp
Π=(A,Q,q0,Qf,T,h,k,P) iff there is a m-sdp
Π'=(A,Q,q0,Qf,T,h',k,P) which weakly-represents D.

proof: (if): We must show that a m-sdp can represent D. Let xa
be an optimal sequence reaching q, so G(q,p)=g(xa,p). Suppose
g(xa,p) < g(ya,p) yet g(x,p) ≥ g(y,p). By the monotonicity of h', we
get g(xa,p) ≥ g(ya,p) which contradicts our assumption. Thus
g(x,p) < g(y,p) and h' is b-monotonic. By corollary 2, Π' weakly-
represents D.

(only if): Suppose that D is weakly-represented by an sdp
Π=(A,Q,q0,Qf,T,h,k,P), and h' is defined by (7) from h; then by
corollary 2, h is b-monotonic and by lemma 5 h' is monotonic.

We can show that D is weakly-represented by the sdp
Π'=(A,Q,q0,Qf,T,h',k,P). (2) holds by definition. Let x∈A* be a
1-optimal sequence reaching state q∈Q so G(q,p)=g(x,p). Such a
sequence exists by theorem 6. By construction g'(x,p)=g(x,p) so
G'(q,p)=G(q,p) for all states q∈Q. Equation (3) must hold for
G'(q,p) since it holds for G(q,p) as a result of corollary 2.
Lemma 4 shows that θ(q,p)=θ'(q,p) so θ'(q,p) is a nonempty subset
of optimal sequences. Finally Π' represents D since F(Π')=F(Π)=S
and

$$O(\Pi',p) = \bigcup_{q \in Q_f} \theta'(q,p) = \bigcup_{q \in Q_f} \theta(q,p) = O(\Pi,p) \subseteq O(D,p).$$

QED


6.  Conclusion. 


This  paper  has  given  necessary  and  sufficient  conditions  for 
the  strong  and  weak  representation  of  a  discrete  decision  problem 
by  a  sequential  decision  process.  Strictly  monotonic  (monotonic) 
sequential decision processes have been shown to be equivalent in
the  strong  (weak)  representation  sense  to  the  class  of  discrete 
decision  problems  which  can  be  formulated  as  discrete  dynamic 
programs.  We  have  shown  that  the  problems  to  which  the  principle 
of  optimality  applies  are  a  subclass  of  the  problems  to  which  the 
functional  equations  of  dynamic  programming  are  applicable. 

Appendix 

In order to establish lemma 1 we will need the following
definition and lemma. We say x∈A* is completely optimal if every
initial segment y of x (every y∈A* such that there exists z∈A*
such that yz=x) is 1-optimal.
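
The definition can be made concrete with a short sketch. The table
of sequences, states and costs below is hypothetical, the parameter
p is dropped, and a sequence is taken to be 1-optimal when its cost
equals the minimum cost over the tabulated sequences reaching the
same state.

```python
# Hypothetical data: sequence -> state reached, and sequence -> cost g.
reach = {"": "q0", "a": "q1", "b": "q1", "ac": "q2", "bc": "q2"}
g = {"": 0, "a": 1, "b": 2, "ac": 3, "bc": 3}

# G(q): minimum cost over the tabulated sequences reaching q.
G = {}
for x, q in reach.items():
    G[q] = min(G.get(q, g[x]), g[x])

def is_one_optimal(x):
    """x is 1-optimal when g(x) equals G(t(x))."""
    return g[x] == G[reach[x]]

def is_completely_optimal(x):
    """Every initial segment of x, from the empty sequence up to x, is 1-optimal."""
    return all(is_one_optimal(x[:i]) for i in range(len(x) + 1))

for x in ("ac", "bc"):
    print(x, is_one_optimal(x), is_completely_optimal(x))
```

In this table bc is 1-optimal but not completely optimal, because
its initial segment b is not 1-optimal, whereas ac is both.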

Lemma 2. xa∈θ(q,p) iff xa is completely optimal.

proof: by induction on the length of a sequence. Let the length
of x be 1, i.e., x∈A. x∈θ(q,p) iff (q0,x)∈T(q,p) and e∈θ(q0,p),
where e is the empty sequence and t(x)=q. By definition
e∈θ(q0,p), and x∈θ(q,p) iff G(q,p)=g(x,p) iff x is an optimal
sequence.

Induction step: Assume that the lemma holds for any
sequence of length <m and let the length of the sequence xa be m.
xa∈θ(q,p) iff (q',a)∈T(q,p) and x∈θ(q',p) where t(q',a)=q. By
induction hypothesis x∈θ(q',p) iff x is completely optimal. This
implies that G(q',p)=g(x,p). Also (q',a)∈T(q,p) iff
G(q,p) = h(G(q',p),q',a,p) = h(g(x,p),q',a,p) = g(xa,p). (xa is
1-optimal and x is completely optimal → xa is completely optimal),
i.e., xa is completely optimal. QED

The following lemma establishes the effective equivalence of h
and h' defined by (5) in the sense that the set of optimal
sequences obtained for each state is the same for both cost
functions. The lemma also holds true for h' defined by equation (7).

Lemma 4. ∀q∈Q, ∀p∈P θ(q,p)=θ'(q,p).

proof: x∈θ(q,p) iff x is completely optimal (by lemma 2),
iff x=a1a2...an and a1...ai is 1-optimal with respect to h for
i=1,...,n,
iff g'(a1,p)=g(a1,p)=G'(t(a1),p) and ... and g'(a1...an,p) =
g(a1...an,p) = G'(t(a1...an),p) by construction,
iff x is completely optimal with respect to h',
iff x∈θ'(q,p) (by lemma 2).

QED